{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "# 案例1：解线性方程组\n",
    "\n",
    "## 例1\n",
    "\n",
    "求方程组的解：\n",
    "\n",
    "$\\begin{cases}-x_1 + 3x_2 - 5x_3 &= -3 \\\\2x_1 -2x_2 + 4x_3 &= 8 \\\\ x_1 + 3x_2 &= 6\\end{cases}$\n",
    "\n",
    "用矩阵的方式，可以将方程组表示为：\n",
    "\n",
    "$\\begin{bmatrix}-1&3&-5\\\\2&-2&4\\\\1&3&0\\end{bmatrix}\\begin{bmatrix}x_1\\\\x_2\\\\x_3\\end{bmatrix}=\\begin{bmatrix}-3\\\\8\\\\6\\end{bmatrix}$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[-1  3 -5]\n",
      " [ 2 -2  4]\n",
      " [ 1  3  0]]\n",
      "int64\n",
      "----------------------------------------------------------------------------------------------------\n",
      "1 2 3;4 5 6;7 8 9\n",
      "----------------------------------------------------------------------------------------------------\n",
      "[[1 2 3]\n",
      " [4 5 6]\n",
      " [7 8 9]]\n",
      "int64\n",
      "----------------------------------------------------------------------------------------------------\n",
      "[[1]\n",
      " [2]\n",
      " [3]]\n",
      "----------------------------------------------------------------------------------------------------\n",
      "[[ 2.25]\n",
      " [ 0.25]\n",
      " [-0.5 ]]\n"
     ]
    }
   ],
   "source": [
    "#csb_print_codes\r\n",
    "import numpy as np\r\n",
    "A = np.mat(\"-1 3 -5;2 -2 4;1 3 0\")\r\n",
    "print(A)\r\n",
    "print(A.dtype)\r\n",
    "print(\"-\"*100)\r\n",
    "a = \"1 2 3;4 5 6;7 8 9\"\r\n",
    "print(a)\r\n",
    "print(\"-\"*100)\r\n",
    "a_1 = np.array([[1,2,3],\r\n",
    "                [4,5,6],\r\n",
    "                [7,8,9]])\r\n",
    "print(a_1)\r\n",
    "print(a_1.dtype)\r\n",
    "print(\"-\"*100)\r\n",
    "b = np.mat(\"1 2 3\").T\r\n",
    "print(b)\r\n",
    "print(\"-\"*100)\r\n",
    "solve = np.linalg.solve(A,b)\r\n",
    "print(solve)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# 引入模块\n",
    "import numpy as np"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "matrix([[-1,  3, -5],\n",
       "        [ 2, -2,  4],\n",
       "        [ 1,  3,  0]])"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 创建系数矩阵\n",
    "A = np.mat(\"-1 3 -5; 2 -2 4; 1 3 0\")\n",
    "A"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[1]\n",
      " [2]\n",
      " [3]]\n"
     ]
    }
   ],
   "source": [
    "#csb_print_codes\r\n",
    "b = np.mat(\"1 2 3\").T\r\n",
    "print(b)\r\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "matrix([[-3],\n",
       "        [ 8],\n",
       "        [ 6]])"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "b = np.mat('-3 8 6').T                  # 常数项\n",
    "b"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "matrix([[ 4.5],\n",
       "        [ 0.5],\n",
       "        [-0. ]])"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 求解\n",
    "r = np.linalg.solve(A, b)    \n",
    "r"
   ]
  },
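  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "A quick sanity check on the result above: a minimal sketch that substitutes the solution back into the system. It assumes plain `ndarray` objects instead of `np.mat` (a substitution made here, not part of the original example) and uses the illustrative names `A_arr`, `b_arr`, and `x`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "# Coefficient matrix and constant terms of Example 1 as plain arrays\n",
    "A_arr = np.array([[-1, 3, -5], [2, -2, 4], [1, 3, 0]], dtype=float)\n",
    "b_arr = np.array([-3, 8, 6], dtype=float)\n",
    "\n",
    "x = np.linalg.solve(A_arr, b_arr)\n",
    "\n",
    "# A @ x should reproduce b up to floating-point rounding\n",
    "print(np.allclose(A_arr @ x, b_arr))"
   ]
  },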
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "## 例2\n",
    "\n",
    "$\\begin{cases}x_1+3x_2-4x_3+2x_4&=0\\\\3x_1-x_2+2x_3-x_4&=0\\\\-2x_1+4x_2-x_3+3x_4&=0\\\\3x_1+9x_2-7x_3+6x_4&=0\\end{cases}$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[ 0.]\n",
      " [ 0.]\n",
      " [-0.]\n",
      " [ 0.]]\n"
     ]
    }
   ],
   "source": [
    "# 求解，得到的是 0 解\n",
    "A = np.mat(\"1 3 -4 2;3 -1 2 -1;-2 4 -1 3;3 0 -7 6\")\n",
    "b = np.mat(\"0 0 0 0\").T\n",
    "r = np.linalg.solve(A, b)\n",
    "print(r)"
   ]
  },
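  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "The system as stated in Example 2 has a coefficient of 9 in the last equation. A minimal sketch, assuming that stated matrix (named `A_stated` here purely for illustration), checks its rank with `np.linalg.matrix_rank`. A rank below 4 means the homogeneous system has nonzero solutions, which `np.linalg.solve` cannot describe."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "# Coefficient matrix exactly as written in the system of Example 2\n",
    "A_stated = np.array([[1, 3, -4, 2],\n",
    "                     [3, -1, 2, -1],\n",
    "                     [-2, 4, -1, 3],\n",
    "                     [3, 9, -7, 6]], dtype=float)\n",
    "\n",
    "rank = np.linalg.matrix_rank(A_stated)\n",
    "print(rank)\n",
    "\n",
    "# Number of free variables = number of unknowns - rank\n",
    "print(A_stated.shape[1] - rank)"
   ]
  },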
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "使用 sympy 求解，需要安装此第三方库.\n",
    "\n",
    "如下安装方法是在 aistudio 中的安装，如果在本地安装，直接运行：\n",
    "\n",
    "`!pip install sympy` 即可"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple\n",
      "Collecting sympy\n",
      "  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/78/43/33c5a5e7fbafbf51520f4e09cb0634a1ca1d4cd5469c57967e43183d7a42/sympy-1.9-py3-none-any.whl\n",
      "Collecting mpmath>=0.19 (from sympy)\n",
      "  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/d4/cf/3965bddbb4f1a61c49aacae0e78fd1fe36b5dc36c797b31f30cf07dcbbb7/mpmath-1.2.1-py3-none-any.whl\n",
      "Installing collected packages: mpmath, sympy\n",
      "Successfully installed mpmath-1.2.1 sympy-1.9\n"
     ]
    }
   ],
   "source": [
    "!mkdir /home/aistudio/external-libraries\n",
    "!pip install sympy -t /home/aistudio/external-libraries"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "import sys \n",
    "sys.path.append('/home/aistudio/external-libraries')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/latex": [
       "$\\displaystyle \\left\\{\\left( \\frac{x_{4}}{10}, \\  - \\frac{7 x_{4}}{10}, \\  0, \\  x_{4}\\right)\\right\\}$"
      ],
      "text/plain": [
       "{(x4/10, -7*x4/10, 0, x4)}"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 上述安装完毕，并加载到搜索路径之后，执行下述代码。\n",
    "from sympy import *\n",
    "from sympy.solvers.solveset import linsolve\n",
    "x1, x2, x3, x4 = symbols(\"x1 x2 x3 x4\")\n",
    "linsolve([x1 + 3*x2 - 4*x3 + 2*x4, \n",
    "            3*x1 - x2 + 2*x3 - x4, \n",
    "            -2*x1 + 4*x2 - x3 + 3*x4, \n",
    "            3*x1 +9*x2 - 7*x3 + 6*x4], \n",
    "        (x1, x2, x3, x4))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "这就是该线性方程组的通解，如果对应到未知量，即：\n",
    "\n",
    "$\\begin{cases}x_1=\\frac{x_4}{10}\\\\x_2=-\\frac{7}{10}x_4\\\\x_3=0\\\\x_4=x_4\\end{cases}$\n",
    "\n",
    "其中 $x_4$ 是自由变量。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "# 案例2：假设检验\n",
    "\n",
    "假设有人制造了一个骰子，他声称是均匀的，也就是假设分布律为：\n",
    "\n",
    "$H_0:P(X=i)=\\frac{1}{6}, (i=1,2,3,4,5,6)$\n",
    "\n",
    "为了证明自己的判断，他做了 $n=6\\times10^{10}$ 此投掷试验，并将各个点数出现的次数记录下来（为了便于观察，次数用 $10^{10}$ 加或减一个数表示。\n",
    "\n",
    "| 点数 | 1                | 2                         | 3                     | 4                       | 5                       | 6                       |\n",
    "| ---- | ---------------- | ------------------------- | --------------------- | ----------------------- | ----------------------- | ----------------------- |\n",
    "| 次数 | $10^{10}-10^{6}$ | $10^{10}+1.5\\times10^{6}$ | $10^{10}-2\\times10^6$ | $10^{10}+4\\times10^{6}$ | $10^{10}-3\\times10^{6}$ | $10^{10}+0.5\\times10^6$ |\n",
    "\n",
    "接下来利用 $\\chi^2$ 检验法，通过如下程序对此人的结论——假设——进行检验。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "the isf is: \n",
      "11.070497693516355\n",
      "----------------------------------------------------------------------------------------------------\n",
      "the cdf is: \n",
      "0.9499903813775946\n"
     ]
    }
   ],
   "source": [
    "from scipy.stats import chi2\r\n",
    "isf = chi2.isf(0.05,5)\r\n",
    "cdf = chi2.cdf(11.07,5)\r\n",
    "print(\"the isf is: \\n{}\".format(isf))\r\n",
    "print(\"-\"*100)\r\n",
    "print(\"the cdf is: \\n{}\".format(cdf))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Power_divergenceResult(statistic=3250.0, pvalue=0.0)"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 进行卡方检验\n",
    "from scipy.stats import chisquare\n",
    "chisquare([1e10-1e6, 1e10+1.5e6, 1e10-2e6, 1e10+4e6, 1e10-3e6, 1e10+0.5e6])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "输出结果显示，检验统计量的值 $\\chi^2 =3250.0$ 。根据如下公式：\n",
    "\n",
    "$\\chi^2 = \\sum_{i=1}^k\\frac{(np_i-f_i)^2}{np_i}$\n",
    "\n",
    "此处 $𝑘=6$ ，则在 $\\alpha = 0.05$ 的显著水平下，$\\chi^2_{0.05}(6-1)$ 的值为："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "11.070497693516355"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from scipy.stats import chi2 \n",
    "chi2.isf(0.05, (6-1))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "由输出值可知，在显著水平 $\\alpha = 0.05$ 下，$\\chi^2\\gt\\chi^2_{0.05}(6-1)=11.07$ ，故试验数据不支持“骰子均匀”这个假设。"
   ]
  },
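  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "As a hand check of the statistic: under $H_0$ every expected count is $np_i = 10^{10}$, and the deviations $f_i - np_i$ can be read directly off the table. Substituting them into the $\\chi^2$ formula gives\n",
    "\n",
    "$\\chi^2 = \\frac{(1^2 + 1.5^2 + 2^2 + 4^2 + 3^2 + 0.5^2)\\times 10^{12}}{10^{10}} = 32.5 \\times 100 = 3250$\n",
    "\n",
    "which agrees with the value returned by `chisquare` and is far larger than the critical value 11.07."
   ]
  },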
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.0\n"
     ]
    }
   ],
   "source": [
    "# 也可以得到 p 值\n",
    "p_value = 1 - chi2.cdf(3250.0, (6-1)) \n",
    "print(p_value)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "得到的 p 值结果说明拒绝原假设犯错误的概率是 0.0% "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "# 案例3：文胸商品评论数据\n",
    "\n",
    "**商业场景**：在网上开店，应该如何决策\n",
    "\n",
    "![](https://ai-studio-static-online.cdn.bcebos.com/70d49d9dcdca4e9980308c2665fc564956b5be7217e9478694ad724c8cddee52)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "### 第1步：利用网络爬虫，从京东官方网站上，获得有关文胸产品的评论数据\n",
    "\n",
    "爬虫代码和方法，略。但是，特别建议同学们自己写出相应代码。\n",
    "\n",
    "### 第2步：对获得的原始数据进行分析"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>creationTime</th>\n",
       "      <th>productColor</th>\n",
       "      <th>productSize</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>2016-06-08 17:17:00</td>\n",
       "      <td>22咖啡色</td>\n",
       "      <td>75C</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2017-04-07 19:34:25</td>\n",
       "      <td>22咖啡色</td>\n",
       "      <td>80B</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>2016-06-18 19:44:56</td>\n",
       "      <td>02粉色</td>\n",
       "      <td>80C</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>2017-08-03 20:39:18</td>\n",
       "      <td>22咖啡色</td>\n",
       "      <td>80B</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>2016-07-06 14:02:08</td>\n",
       "      <td>22咖啡色</td>\n",
       "      <td>75B</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          creationTime productColor productSize\n",
       "0  2016-06-08 17:17:00        22咖啡色         75C\n",
       "1  2017-04-07 19:34:25        22咖啡色         80B\n",
       "2  2016-06-18 19:44:56         02粉色         80C\n",
       "3  2017-08-03 20:39:18        22咖啡色         80B\n",
       "4  2016-07-06 14:02:08        22咖啡色         75B"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 读入原始数据\n",
    "import pandas as pd\n",
    "datas = pd.read_csv(\"/home/aistudio/data/data14408/bra.csv\")\n",
    "datas.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array(['22咖啡色', '02粉色', '071蓝色', '071黑色', '071肤色', '0993无痕肤色', '0993无痕黑色',\n",
       "       '071红色', '0993无痕酒红色', 'h03无痕蓝灰', '蓝灰色', '酒红色', '内裤酒红色', '内裤蓝灰色',\n",
       "       nan, '肤色', '藕荷色', '藕荷色(套装)', '玫瑰色', '深蓝色', '烟灰紫', '天蓝色', '黑色',\n",
       "       '红色', '香妃红', '杏肤色', '诱惑黑', '土豪金', '大红', '嫣紫', '皇家蓝', '粉晶色', '柔肤色',\n",
       "       '蓝灰', '大红色', '紫色无钢圈厚杯', '肤色无钢圈厚杯', '黑色无钢圈厚杯', '紫色套装', '爱心大红色',\n",
       "       '肤色套装', '爱心肤色', '爱心黑色', '紫色 单件', '紫色 套装', '肤色 单件', '黑色 单价',\n",
       "       '肤色 套装', '黑色 套装', '浅紫色', '内裤黑色', '内裤肤色', '浅蓝', '浅黄', '宝蓝色', '酒红',\n",
       "       '灰色', '浅粉色', '醇黑', '红条纹', '蓝色', '蓝条纹', 'PNK', 'LAV', 'BLK', 'GRN',\n",
       "       '浅紫色  ', '肤色    ', '浅紫色 ', '肤色 ', '枚红色 ', '粉色', '蓝色  ', '西瓜红',\n",
       "       '银灰色', '天蓝色  ', '8626黑色', '红色 ', '国色天香 红色【厚】', '磁石款  黑色【薄】',\n",
       "       '磁石款  肤色【薄】', '磁石款 肤色【薄】', '美背款 粉色【厚】', '磁石款  粉色【厚】', '经典款 粉色【薄】',\n",
       "       '磁石款 黑色【薄】', '磁石款 肤色【厚】', '国色天香 黑色【厚】', '磁石款  红色【厚】', '磁石款 粉色【薄】',\n",
       "       '天衣无缝 黑色【AB厚,C薄】', '美背款 黑色【厚】', '磁石款  黑色【厚】', '国色天香 深紫【厚】',\n",
       "       '经典款 粉色【厚】', '经典款 肤色【厚】', '磁石款  蓝色【厚】', '经典款 黑色【厚】', '经典款 黑色【薄】',\n",
       "       '国色天香 粉色【厚】', '天衣无缝 粉色【AB厚,C薄】', '经典款 肤色【薄】', '天衣无缝 肤色【AB厚,C薄】',\n",
       "       '粉红色', '碧绿色', '嫩黄色', '紫兰', '粉色 单件', '粉色 套装', '蓝色 套装', '大红色 套装',\n",
       "       '大红色 单件', '黑色 单件', '蓝色 单件', '浅紫', '紫色套装（其他颜色备注）', '粉色套装（含内裤）',\n",
       "       '虾粉'], dtype=object)"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 查看特征 productColor 下唯一值\n",
    "datas['productColor'].unique()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>creationTime</th>\n",
       "      <th>productColor</th>\n",
       "      <th>productSize</th>\n",
       "      <th>color</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>2016-06-08 17:17:00</td>\n",
       "      <td>22咖啡色</td>\n",
       "      <td>75C</td>\n",
       "      <td>棕色</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2017-04-07 19:34:25</td>\n",
       "      <td>22咖啡色</td>\n",
       "      <td>80B</td>\n",
       "      <td>棕色</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>2016-06-18 19:44:56</td>\n",
       "      <td>02粉色</td>\n",
       "      <td>80C</td>\n",
       "      <td>粉色</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>2017-08-03 20:39:18</td>\n",
       "      <td>22咖啡色</td>\n",
       "      <td>80B</td>\n",
       "      <td>棕色</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>2016-07-06 14:02:08</td>\n",
       "      <td>22咖啡色</td>\n",
       "      <td>75B</td>\n",
       "      <td>棕色</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          creationTime productColor productSize color\n",
       "0  2016-06-08 17:17:00        22咖啡色         75C    棕色\n",
       "1  2017-04-07 19:34:25        22咖啡色         80B    棕色\n",
       "2  2016-06-18 19:44:56         02粉色         80C    粉色\n",
       "3  2017-08-03 20:39:18        22咖啡色         80B    棕色\n",
       "4  2016-07-06 14:02:08        22咖啡色         75B    棕色"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 对上述数据要进行数据清洗，清洗方法：手工、程序\n",
    "# 注意：对业务越熟悉，数据清洗得就越“干净”，且符合分析所需。\n",
    "# 下面的数据是清洗之后的结果\n",
    "cleaned_datas = pd.read_csv(\"data/data14408/cleaned_data.csv\", index_col=0)\n",
    "cleaned_datas.head()"
   ]
  },
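  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "The cleaning rules themselves are not shown in this notebook. As an illustration of the programmatic route only, here is a minimal sketch assuming a keyword lookup: the table `keyword_to_color`, the helper `normalize_color`, and the fallback label are all hypothetical and are not the rules that produced `cleaned_data.csv`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "# A few raw values taken from the productColor column above\n",
    "raw = pd.Series(['22咖啡色', '02粉色', '肤色    ', '磁石款  黑色【薄】', None])\n",
    "\n",
    "# Hypothetical keyword -> canonical color table (an assumption, not the author's rules)\n",
    "keyword_to_color = {'咖啡': '棕色', '粉': '粉色', '肤': '肤色', '黑': '黑色'}\n",
    "\n",
    "def normalize_color(value):\n",
    "    # Keep missing values as they are\n",
    "    if pd.isna(value):\n",
    "        return value\n",
    "    text = str(value).strip()\n",
    "    # Return the canonical color of the first keyword found in the text\n",
    "    for keyword, color in keyword_to_color.items():\n",
    "        if keyword in text:\n",
    "            return color\n",
    "    return '其他'  # fallback label for values that match no keyword\n",
    "\n",
    "print(raw.map(normalize_color).tolist())"
   ]
  },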
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "DISTRIB_ID=Ubuntu\r\n",
      "DISTRIB_RELEASE=16.04\r\n",
      "DISTRIB_CODENAME=xenial\r\n",
      "DISTRIB_DESCRIPTION=\"Ubuntu 16.04.6 LTS\"\r\n",
      "NAME=\"Ubuntu\"\r\n",
      "VERSION=\"16.04.6 LTS (Xenial Xerus)\"\r\n",
      "ID=ubuntu\r\n",
      "ID_LIKE=debian\r\n",
      "PRETTY_NAME=\"Ubuntu 16.04.6 LTS\"\r\n",
      "VERSION_ID=\"16.04\"\r\n",
      "HOME_URL=\"http://www.ubuntu.com/\"\r\n",
      "SUPPORT_URL=\"http://help.ubuntu.com/\"\r\n",
      "BUG_REPORT_URL=\"http://bugs.launchpad.net/ubuntu/\"\r\n",
      "VERSION_CODENAME=xenial\r\n",
      "UBUNTU_CODENAME=xenial\r\n"
     ]
    }
   ],
   "source": [
    "# 以上数据中所显示的 'color' 列是清洗之后的数据，后续分析所用数据即为此列\n",
    "# 下面绘制图像，但是 aistudio 中显示汉字是一个麻烦问题。所以先解决\n",
    "# 此外，在本地也存在汉字显示问题，可以参考如下做法\n",
    "\n",
    "# 如何解决汉字显示问题\n",
    "\n",
    "# # 当前操作系统是 Ubuntu\n",
    "!cat /etc/*-release"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "cmap  fangzheng  truetype  type1  X11\r\n"
     ]
    }
   ],
   "source": [
    "# 自带字体\n",
    "!ls /usr/share/fonts/"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "/usr/share/fonts/truetype/droid/DroidSansFallbackFull.ttf: Droid Sans Fallback:style=Regular\r\n"
     ]
    }
   ],
   "source": [
    "# 系统可用的 ttf 格式中文字体\n",
    "!fc-list :lang=zh | grep \".ttf\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/matplotlib']"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# matplotlib 的存储路径\n",
    "import matplotlib\n",
    "matplotlib.__path__"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "cmb10.ttf\t\t\tDejaVuSerif.ttf\r\n",
      "cmex10.ttf\t\t\tLICENSE_STIX\r\n",
      "cmmi10.ttf\t\t\tSTIXGeneralBolIta.ttf\r\n",
      "cmr10.ttf\t\t\tSTIXGeneralBol.ttf\r\n",
      "cmss10.ttf\t\t\tSTIXGeneralItalic.ttf\r\n",
      "cmsy10.ttf\t\t\tSTIXGeneral.ttf\r\n",
      "cmtt10.ttf\t\t\tSTIXNonUniBolIta.ttf\r\n",
      "DejaVuSans-BoldOblique.ttf\tSTIXNonUniBol.ttf\r\n",
      "DejaVuSans-Bold.ttf\t\tSTIXNonUniIta.ttf\r\n",
      "DejaVuSansDisplay.ttf\t\tSTIXNonUni.ttf\r\n",
      "DejaVuSansMono-BoldOblique.ttf\tSTIXSizFiveSymReg.ttf\r\n",
      "DejaVuSansMono-Bold.ttf\t\tSTIXSizFourSymBol.ttf\r\n",
      "DejaVuSansMono-Oblique.ttf\tSTIXSizFourSymReg.ttf\r\n",
      "DejaVuSansMono.ttf\t\tSTIXSizOneSymBol.ttf\r\n",
      "DejaVuSans-Oblique.ttf\t\tSTIXSizOneSymReg.ttf\r\n",
      "DejaVuSans.ttf\t\t\tSTIXSizThreeSymBol.ttf\r\n",
      "DejaVuSerif-BoldItalic.ttf\tSTIXSizThreeSymReg.ttf\r\n",
      "DejaVuSerif-Bold.ttf\t\tSTIXSizTwoSymBol.ttf\r\n",
      "DejaVuSerifDisplay.ttf\t\tSTIXSizTwoSymReg.ttf\r\n",
      "DejaVuSerif-Italic.ttf\r\n"
     ]
    }
   ],
   "source": [
    "# 那么，matplotlib保存字体的路径就应该是/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/matplotlib/mpl-data/fonts/ttf，可以查看其中的字体\n",
    "!ls /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/matplotlib/mpl-data/fonts/ttf"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "SimHei\n"
     ]
    }
   ],
   "source": [
    "# 没有支持汉字显示的字体\n",
    "# 安装汉字字体。首先要下载支持汉字的字体，比如 simhei.ttf（自己到网上搜索下载）。此处已经上传到本空间了。\n",
    "\n",
    "import matplotlib.font_manager as font_manager\n",
    "fontpath = 'work/simhei.ttf'\n",
    "prop = font_manager.FontProperties(fname=fontpath)\n",
    "print(prop.get_name())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# 下载的字体文件simhei.ttf，在Matplotlib中，所引用的字体名称是SimHei\n",
    "# 创建字体对象\n",
    "\n",
    "from matplotlib.font_manager import FontProperties\n",
    "font = FontProperties(fname='work/simhei.ttf', size=16)    # 创建字体对象"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "([<matplotlib.axis.XTick at 0x7f65a4785ed0>,\n",
       "  <matplotlib.axis.XTick at 0x7f6578799b50>,\n",
       "  <matplotlib.axis.XTick at 0x7f656fd9c2d0>,\n",
       "  <matplotlib.axis.XTick at 0x7f656fd54e90>,\n",
       "  <matplotlib.axis.XTick at 0x7f656fd5d510>,\n",
       "  <matplotlib.axis.XTick at 0x7f656fd5dad0>,\n",
       "  <matplotlib.axis.XTick at 0x7f656fd65210>,\n",
       "  <matplotlib.axis.XTick at 0x7f656fd656d0>,\n",
       "  <matplotlib.axis.XTick at 0x7f656fd65c90>,\n",
       "  <matplotlib.axis.XTick at 0x7f656fd6d290>],\n",
       " <a list of 10 Text xticklabel objects>)"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYAAAAEBCAYAAABxK3LCAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvIxREBQAAHphJREFUeJzt3XuYHVWd7vHvS0ISQCAJtBBzIREyIsoYsAXUcWTkFsKZCXqUizMQGOZEZ+AccbyBR0TBKI4X1DmKEyUSPCIwKpLBOEzk4vWoJBqRiwwtxiExkEhIQIFA4Hf+WGuTymbv7t3d1d2E9X6ep56uWrXqsvauql+tVat2KyIwM7Py7DDSO2BmZiPDAcDMrFAOAGZmhXIAMDMrlAOAmVmhHADMzArlAGBmVigHADOzQjkAmJkVygHAzKxQo0d6B3qz5557xvTp00d6N8zMtisrVqz4fUR09ZWvzwAgaRzwPWBszv+1iDhf0mXAa4FNOetpEbFSkoBPA3OAR3L6z/K65gHvy/k/FBGLe9v29OnTWb58eV+7aGZmFZJ+20m+TmoAm4HXRcQfJO0I/EDSt/O8d0XE15ryHwvMzMOhwCXAoZImAucD3UAAKyQtiYgHO9lRMzOrV5/PACL5Q57cMQ+9/YToXODyvNyPgfGSJgHHAMsiYkO+6C8DZg9u983MbKA6eggsaZSklcA60kX8J3nWAkm3SrpY0ticNhm4t7L46pzWLt3MzEZARwEgIp6MiFnAFOAQSS8FzgX2B14BTATeU8cOSZovabmk5evXr69jlWZm1kK/uoFGxEbgJmB2RKzNzTybgS8Bh+Rsa4CplcWm5LR26c3bWBgR3RHR3dXV50NsMzMboD4DgKQuSePz+E7AUcCvcrs+udfP8cBteZElwKlKDgM2RcRa4HrgaEkTJE0Ajs5pZmY2AjrpBTQJWCxpFClgXB0R10m6UVIXIGAl8NacfympC2gPqRvo6QARsUHShcAtOd8FEbGhvqKYmVl/6Nn8P4G7u7vD7wGYmfWPpBUR0d1Xvmf1m8Bmtv2Zfs63alvXqouOq21d9kz+LSAzs0I5AJiZFcoBwMysUA4AZmaFcgAwMyuUA4CZWaEcAMzMCuUAYGZWKAcAM7NCOQCYmRXKAcDMrFAOAGZmhXIAMDMrlAOAmVmhHADMzArlAGBmVigHADOzQjkAmJkVygHAzKxQDgBmZoVyADAzK1SfAUDSOEk/lfQLSbdL+mBOnyHpJ5J6JF0laUxOH5une/L86ZV1nZvT75J0zFAVyszM+tZJDWAz8LqIeBkwC5gt6TDgo8DFEbEf8CBwRs5/BvBgTr8450PSAcBJwEuA2cDnJI2qszBmZta5PgNAJH/IkzvmIYDXAV/L6YuB4/P43DxNnn+EJOX0KyNic0T8BugBDqmlFGZm1m8dPQOQNErSSmAdsAz4NbAxIrbkLKuByXl8MnAvQJ6/Cdijmt5iGTMzG2YdBYCIeDIiZgFTSHft+w/VDkmaL2m5pOXr168fqs2YmRWvX72AImIjcBPwSmC8pNF51hRgTR5fA0wFyPN3Bx6oprdYprqNhRHRHRHdXV1d/dk9MzPrh056AXVJGp/HdwKOAu4kBYI35mzzgGvz+JI8TZ5/Y0RETj8p9xKaAcwEflpXQczMrH9G952FScDi3GNnB+DqiLhO0h3AlZI+BPwcuDTnvxT4sqQeYAOp5w8Rcbukq4E7gC3AmRHxZL3FMTOzTvUZACLiVuCgFun30KIXT0Q8BrypzboWAAv6v5tmZlY3vwlsZlYoBwAzs0I5AJiZFcoBwMysUA4AZmaFcgAwMyuUA4CZWaEcAMzMCuUAYGZWKAcAM7NCOQCYmRXKAcDMrFCd/Bqo2XZl+jnfqnV9qy46rtb1mT1buAZgZlYoBwAzs0I5AJiZFcoBwMysUA4AZmaFcgAwMyuUA4CZWaEcAMzMCtVnAJA0
VdJNku6QdLukt+X0D0haI2llHuZUljlXUo+kuyQdU0mfndN6JJ0zNEUyM7NOdPIm8BbgHRHxM0m7AiskLcvzLo6Ij1czSzoAOAl4CfAC4DuS/iTP/ixwFLAauEXSkoi4o46CmJlZ//QZACJiLbA2jz8s6U5gci+LzAWujIjNwG8k9QCH5Hk9EXEPgKQrc14HADOzEdCvZwCSpgMHAT/JSWdJulXSIkkTctpk4N7KYqtzWrt0MzMbAR3/GJyk5wFfB86OiIckXQJcCET++wngbwe7Q5LmA/MBpk2bNtjV2Qip8wfZ/GNsZkOjoxqApB1JF/+vRMQ3ACLi/oh4MiKeAr7A1maeNcDUyuJTclq79G1ExMKI6I6I7q6urv6Wx8zMOtRJLyABlwJ3RsQnK+mTKtleD9yWx5cAJ0kaK2kGMBP4KXALMFPSDEljSA+Kl9RTDDMz669OmoBeDZwC/FLSypz2XuBkSbNITUCrgLcARMTtkq4mPdzdApwZEU8CSDoLuB4YBSyKiNtrLIuZmfVDJ72AfgCoxaylvSyzAFjQIn1pb8uZmdnw8ZvAZmaFcgAwMyuUA4CZWaEcAMzMCuUAYGZWKAcAM7NCOQCYmRXKAcDMrFAOAGZmhXIAMDMrlAOAmVmhHADMzArlAGBmVigHADOzQjkAmJkVquP/CWxm2wf/P2brlGsAZmaFcg3AzKwm21vtyzUAM7NCOQCYmRXKAcDMrFB9BgBJUyXdJOkOSbdLeltOnyhpmaS7898JOV2SPiOpR9Ktkg6urGtezn+3pHlDVywzM+tLJzWALcA7IuIA4DDgTEkHAOcAN0TETOCGPA1wLDAzD/OBSyAFDOB84FDgEOD8RtAwM7Ph12cAiIi1EfGzPP4wcCcwGZgLLM7ZFgPH5/G5wOWR/BgYL2kScAywLCI2RMSDwDJgdq2lMTOzjvXrGYCk6cBBwE+AvSJibZ51H7BXHp8M3FtZbHVOa5duZmYjoOP3ACQ9D/g6cHZEPCTp6XkREZKijh2SNJ/UdMS0adPqWGWRtrf+yGY2/DqqAUjakXTx/0pEfCMn35+bdsh/1+X0NcDUyuJTclq79G1ExMKI6I6I7q6urv6UxczM+qGTXkACLgXujIhPVmYtARo9eeYB11bST829gQ4DNuWmouuBoyVNyA9/j85pZmY2AjppAno1cArwS0krc9p7gYuAqyWdAfwWOCHPWwrMAXqAR4DTASJig6QLgVtyvgsiYkMtpTAzs37rMwBExA8AtZl9RIv8AZzZZl2LgEX92UEzMxsafhPYzKxQDgBmZoVyADAzK5QDgJlZoRwAzMwK5QBgZlYoBwAzs0I5AJiZFcoBwMysUA4AZmaFcgAwMyuUA4CZWaEcAMzMCuUAYGZWKAcAM7NCOQCYmRXKAcDMrFAOAGZmhXIAMDMrlAOAmVmhHADMzArVZwCQtEjSOkm3VdI+IGmNpJV5mFOZd66kHkl3STqmkj47p/VIOqf+opiZWX90UgO4DJjdIv3iiJiVh6UAkg4ATgJekpf5nKRRkkYBnwWOBQ4ATs55zcxshIzuK0NEfE/S9A7XNxe4MiI2A7+R1AMckuf1RMQ9AJKuzHnv6Pcem5lZLQbzDOAsSbfmJqIJOW0ycG8lz+qc1i7dzMxGyEADwCXAvsAsYC3wibp2SNJ8ScslLV+/fn1dqzUzsyYDCgARcX9EPBkRTwFfYGszzxpgaiXrlJzWLr3VuhdGRHdEdHd1dQ1k98zMrAMDCgCSJlUmXw80eggtAU6SNFbSDGAm8FPgFmCmpBmSxpAeFC8Z+G6bmdlg9fkQWNJXgcOBPSWtBs4HDpc0CwhgFfAWgIi4XdLVpIe7W4AzI+LJvJ6zgOuBUcCiiLi99tKYmVnHOukFdHKL5Et7yb8AWNAifSmwtF97Z2ZmQ8ZvApuZFcoBwMysUA4AZmaFcgAwMyuUA4CZWaEcAMzMCuUAYGZWKAcAM7NCOQCYmRXKAcDMrFAOAGZmhXIA
MDMrlAOAmVmhHADMzArlAGBmVigHADOzQjkAmJkVygHAzKxQDgBmZoVyADAzK5QDgJlZoRwAzMwK1WcAkLRI0jpJt1XSJkpaJunu/HdCTpekz0jqkXSrpIMry8zL+e+WNG9oimNmZp3qpAZwGTC7Ke0c4IaImAnckKcBjgVm5mE+cAmkgAGcDxwKHAKc3wgaZmY2MvoMABHxPWBDU/JcYHEeXwwcX0m/PJIfA+MlTQKOAZZFxIaIeBBYxjODipmZDaOBPgPYKyLW5vH7gL3y+GTg3kq+1TmtXbqZmY2QQT8EjogAooZ9AUDSfEnLJS1fv359Xas1M7MmAw0A9+emHfLfdTl9DTC1km9KTmuX/gwRsTAiuiOiu6ura4C7Z2ZmfRloAFgCNHryzAOuraSfmnsDHQZsyk1F1wNHS5qQH/4endPMzGyEjO4rg6SvAocDe0paTerNcxFwtaQzgN8CJ+TsS4E5QA/wCHA6QERskHQhcEvOd0FEND9YNjOzYdRnAIiIk9vMOqJF3gDObLOeRcCifu2dmZkNGb8JbGZWKAcAM7NCOQCYmRXKAcDMrFAOAGZmhXIAMDMrlAOAmVmhHADMzArlAGBmVigHADOzQjkAmJkVygHAzKxQDgBmZoVyADAzK5QDgJlZoRwAzMwK5QBgZlYoBwAzs0I5AJiZFcoBwMysUA4AZmaFGlQAkLRK0i8lrZS0PKdNlLRM0t3574ScLkmfkdQj6VZJB9dRADMzG5g6agB/ERGzIqI7T58D3BARM4Eb8jTAscDMPMwHLqlh22ZmNkBD0QQ0F1icxxcDx1fSL4/kx8B4SZOGYPtmZtaBwQaAAP5D0gpJ83PaXhGxNo/fB+yVxycD91aWXZ3TzMxsBIwe5PJ/FhFrJD0fWCbpV9WZERGSoj8rzIFkPsC0adMGuXtmZtbOoGoAEbEm/10HXAMcAtzfaNrJf9fl7GuAqZXFp+S05nUujIjuiOju6uoazO6ZmVkvBhwAJO0iadfGOHA0cBuwBJiXs80Drs3jS4BTc2+gw4BNlaYiMzMbZoNpAtoLuEZSYz1XRMS/S7oFuFrSGcBvgRNy/qXAHKAHeAQ4fRDbNjOzQRpwAIiIe4CXtUh/ADiiRXoAZw50e2ZmVi+/CWxmVigHADOzQjkAmJkVygHAzKxQDgBmZoVyADAzK5QDgJlZoRwAzMwK5QBgZlaowf4aqJnZs8b0c75V27pWXXRcbet6tnINwMysUA4AZmaFcgAwMyuUA4CZWaEcAMzMCuUAYGZWKAcAM7NC+T2AIeL+yGb2bOcagJlZoVwDMKuZa3+2vXANwMysUMMeACTNlnSXpB5J5wz39s3MLBnWJiBJo4DPAkcBq4FbJC2JiDuGYnuuipuZtTfcNYBDgJ6IuCciHgeuBOYO8z6YmRnDHwAmA/dWplfnNDMzG2aKiOHbmPRGYHZE/F2ePgU4NCLOquSZD8zPky8C7hri3doT+P0Qb8Pbf/Ztu/Ttl1z2Era/T0R09ZVpuLuBrgGmVqan5LSnRcRCYOFw7ZCk5RHRPVzb8/afHdsuffsll93b32q4m4BuAWZKmiFpDHASsGSY98HMzBjmGkBEbJF0FnA9MApYFBG3D+c+mJlZMuxvAkfEUmDpcG+3F8PW3OTtP6u2Xfr2Sy67t58N60PgkSZpbERsHun9MCuBpOcBWyLisZHeF2vtOfdTEJK+I+ncFumzgQ2SxnewjjMlLZS05wD3YZykXQay7GBI+oKkL7RIP1HSHpLOkfSmPtYxqLKPJEn7SRpUrXZ7Lv9Ik/Svkj5aSboG6Oht/9I/95Eq/3MuAACPAq3uOO4CdgZe09vCkiYC7wceiIiBdtM6GPi9pN0GsnB/D4bKRe/RPNAIQJLGAf+X9BLeZOAvelnPoMuef+rj8cr0YZI2V6YnSdq9l+UHcyJ8GfjXyrom5qAwVdKUyrCvpL1abHtA5VdyRuN7kDRK0s6S+nV+DcVFQNJuks6StG/d25e0
U1PAfYR8/FWmN1Tyj5K0U4v11HHOIenzkr5ZmT5J0n0dLjugz17SeZKiw+F9bdZRS/kHJCKeUwPwTeDsPH4sEL0M9zctK+DrveQ/t5ft7gzsmMdPBn7Rx37uCOzcIn0icD/wkQ7Luyfwn6QL/KfysA/wAPBq0s9uPEB63nMicHeb9Qy47Hn5z+XP+3BgU96vKcAsYCPpof8ppL7P17VZR7/K3rTs/qQLzsxK2tl535/Mf5+qjH+qxu++O6/3sDz9WuAJ4KFc9sbwB+DhISj74037+r6m4yOAY/pYR7+3D9wA/LFSvsdJN1+N6Sfyd9KYfgxYWfNxd2dl/ZvzPjSm/5i/88b0I8DuNX/27wFuJp3/r8tplzWOL9IvHeyS87y77vNusMOQrXg4B2AP4P8AHwfuBm4E/hk4EljVZpnjgdVNX8SngN8Ck5vyfhX4HrBDL/uwqpcvsd1Qy8kA/BC4iq0B4AJSjWdH0sOmL1Y+p83kC1VdZc/5bgVOJwWAyCffFlIAeBy4B/gZ8JY2J+FgLwRfAt6fx2cBbwN2YmtQvhn4QB4f1Uiv6bv/BPBF0kVgTE7bqTFeyfd64K4hKPvDwOGVcr6rMm9sXsdBvSxfy0UIuAL4cGX6W+Sbser+1Hzc7dzIk9d1WdM5viqP75DzqubP/h/zZ/4nOf+LyAGA1BLwGDAj53l73efdYIchW/FwDsAE4EOkatRT+cC7AJhOuijtVP3i80nxIuCMPD0V+Dbp7nT/pnXPySfYC/vYh+nANGDvfEB9OI+3GyYDU+s4GIC/BlaSfmjv08AK0t32bqS70D+v5P0y8P8an0cdZc95VwKnkQLAxkr6rHxiHFhJ2x/Yqa4TIW/zFrZe7G8CvkPlAkwlADQtO6jyk+7uVgEvyOX/CSnA3EhqDqnWAB4Fbm5avo6L4Ca2DQBnk2p/Pfl7CdINwS/yvr67zu1X8h8IvKIy/VZgRpu8dZxzuwCjKtNtA0Ce3qHO4y7ne0/jOyXV/CeyNQDsA/xV5Xt5d53lr2MY0pUP90CKuEHlroOtF6Dm4bRKnsXAHaSmlHZ3AtXhxX3sx0rguMr0PsBtwGvqPhmAvfLJfjDwlTy8khSMzs3lGl8ZXk66M7+MVEOopey5zH9b2d+jSHe8B+blqsHuX4Bv1XEi5DKsJ91pPUa6AVgH7NWU72ZaB4BBlR/4IPChPP594C297Ou5wNfq+N6b8m5s2sfmu+4AZtV53FXyHQHcmcc/AlyRx2fk7+JteXoHtq2RDfq4y9/7w2wbZFs1BTWGh4B1NX/253W4/8G2TXO1XnMGOgzrBXqoB9Jdd5CaQxZUDrxtqn7AOLZtAmjkuZltg8fZVO7YSBfQ6O3AyOt5gnSH/wZS+/CL83L75TwHAyfWcTAA/53UDvoL0gVwfV7XAtKd4V0tlv85cB8wc7BlB8aQftRvC+mB353Ad4HrSDWynYAf5f1aRfoxwPvYemdUx4XgKOBlwL6kE3pOizw30zoADLj8wEtIF7lbSXf+v6ysb8cW27qIFPzG5Hx1Bd9n1ACa5rcLAHV89oeztZnlA+Q7cOCfSBfixsX3YdJFeV7d51yLfdqmJtAmz5BdgKk8A+glz5CVvz/Dc6YXUO7t8j+AtaSq3bskTSfdIY8Fdpc0PncDHZen9waIiKci4pF+bO6pXua9ClgTEWuAC0l34xPyvIfy3/9JeiDbcDopUPyO1E6oiBDwduC7lenGep7uVRMRX4+IF5MewI4GvkG6K/kRsDgiXkSqFXwkr+M60nOBl0bE3YMte6Sf9X4tsFtETCTVuL5JahP9bkQ8CrwX+DfgjRExNSL2jojGT4AMuOyVfVhGugh/BlgY6WXDjgyy/P9JanL7aN6v8yKiERA2StooaVPuAbKJ1CRyMun5yIF1lD0b1Zwg6UhJb5Z0Uk46JveK+RtJf5fT6tp+87b3A84i3ZwcB1wYEbtGxJiIWAz1nnOSVuTPeqOkjcDfA2+upkm6v2mxWsou
aVdJe0uaULm+jAHGNqbzvL0l7dpYruZrzoA9l/4n8DtJd5hrSBe/jaQ28J+R2l6fzNOPke5EdiQFi/2a1nOxpIurCZKiKU9vgXMucJOkA0ht3VeTLpAAu0p6hFQzeLo/fr5oPCKpk3JC64PhfNKFQMDngQ9GxP/K88YDy/P4WODxaN3dbEBlj4hf57wfJT18HUv6bBdLGpunF9Dil13rKHvu8voxUjkvl3Q4qQnoqk5XmvWr/BHxBPB2SbNIDze/mdP3k7RLRPwxXxAeJP0640ZJ42LbF6MG+71DiwBAei50eGV/30G6ARlNugH6Yk3H3Zg8kNc9mnTTtTAirpN0EPAhSb+LiK+2We9gz7nRpLvoy/KynwLGR8Rpefpw0k3J02oqO6Rmz38iXVca+7xzHj85T4v0mb+bdMPQbLDlH7ihqFYM90BqYnmM1Mf/6W6geV616WcVlbb/Fuu5mabqc9P8RnWsXVVcpCaOk0lVue/n9I/k5U4EXke6ED7j4VLefifV0eZmmL/M5f83UvX3I8Cv2do7YhXpZ7ghNc+cUnfZc54/B07I38eXSEF3Ian30WtID0bHtFl2QGXPy55Qmb+O9Dziihbr/0Dd3z2wK+km49hK2kRS19vpleXmApNy3r+vseyjW+SrHv+vzWkX91H2gW7/+Wzt/vhK4OgWeY4kXZBrP+dynpWkG45f5WEDqVmsMf1fVDom1Fj2aaRutuObhiuAS1qk7w1Mqbv8gxmeK01Ad5M+xO+3mHe+pM9UEyR9QtI/DGA7D5EebvW0mf/fSLWMayPiU8CRSrcYc0k9DU6NiBuBAyLdgbTydHW0eWBrdXRspSyzSM88Pka66EPq6bMIGCfpSNId2Q8ryz79olY/9FV2SA9BN5FexnoZqTnsraTq873AS2l9B9TQr7JXLAH+FBgXEc+PiFkR8eb+FK4D7cr/MeAg4GOSfiXp9IjYAFxKql2+LOe7hNT76hpSG3GzgZa98eLS1Jz3u03zj8z7/Jfq/XZ3oNvvAm7ITS/fBq7PzV6Npq+NwNeAByW9pJftt9PJcTea1F1z/4jYH7gcuKYyfWof2xho2X9OCi6rmoY5pJvA5vR7gGv72JdmnZR/wJ4TASAitkTE55vTJZ1IatP7StOs60gn7Cn93M5TEbEKeEVu2mj2TuCqyG17kX536M2kXkDHAYdLOiYinuzPdiuecTBExEpSzeKDlbQ7ImIBqZvcJcAnI+LhPHsXBhAA+iq7pL8ivUzz76SHo5NJXTEfy/t7PSkIzu/v5571diJszts+UNI8SZ+W9MPc/vqMY1zSaKXfqelYL+X/F+AY4Kh8wflSzv9u0mfQeDP5etIF4MORnov0R29l35/0TGFN84z8XOw00sPZR0nt3gPR2/YfBYiI8RHR+JmVffLQnN6fNm/y8n2dc5Dukj/X7hkA6Xwf199tZ23LHhF7AH8DvDKXsYvU0WM86bt/faX8RwEHR8TL+7PxDss/cENRrRjJgfRLo+8Cjib1PHgt6aI3i1Qt/+uc702kE3Jc07JLSFW15upbY5iRl/t403bnkpo8Dqik/Rnp7c935un3kqqn7bqD3kwv1cFKvldReaGmkv5p4NN5/A2kavESUhvxoaS70i20eCloMGXPyz+fdIH5R9LDv31Id8ffzPNPJZ0gp5Haw2srO/BjUjX5PtId1nmku7BGH/iWQ83l78r79mJS08BVpFrPq/L2ppBqaF8Fnldj2S8AVrRaF6kTwq9Jz7sab2Hv32K9A9o+qa173+pnmcva+Lya0/el6e33wX7ulfWMBf4B2LOSNop0LXjG8VbTZ78T6fnOCXn6HcBNeXxaLvO0PH0FsLTu826wQ60rezYMpNfT35e//JfmtD1Id6L3UHkxhcpLIXl6Ts7zIM/sQ9wYfkeqYu5SWW5PUjfHL+fpSaQuqZvJ3VFzukhvjG4hNdG8hsqzgMEeDMAXSA/fIDW3/G/yizJ52X8G3tDmcxtQ2SvL70a6
4DxBqgVBehZxDbA76UWtjaRnI6PrPBFI70G06qI4hXQn1vwS3lRyl9wavvuV+XveRGp+eSMp8N6Ytz2RrRfFF5OC1ANsexwOqOykGvzdpF5Iu+fvvIfUG+5E0nE2p3Ls/QeppvCqOj77XI7ml916Gx4G1td83HWRbjr+K3/ub6jMe2H+HrYAy0gPxpvP+cEcd+/Ln3/jWduR1fKROl4cn8cPIt0MTq2z/IO+Xo7kxXo4B1rcMde8/j8lXXBeQLrLvws4ok3eU0h9579H5WI40gfDIMp+dj64f0Sq9exGeubwMFtrJMon4H3A7aRuo9V1bK9lP4p04a0G8v1IF+cDSTWBJxrHHykonFdH2fMx9yhwAFtfuNtA6g67hcrPMuT8u5MeQm+mUhPYjj/7V+dyLs/nVLsOBi8k/UzMBvJ7CHWUnXRjeVBlehwtfuakMn/Xkf7Mmoei/h/AcJF0IPCrSN0E2+UZB+wR6X2B7Vp+32JW5G6QOe39pLvNqyLiD5X0CaReI18f7v0cbvmXJReReiRdOUTbmBQRa/P4YaQayQuA0yPivBb5dwcOifTuxHZP0ssjYkWHeXcGHov2HTCK4wBgZlao50QvIDMz6z8HADOzQjkAmJkVygHAzKxQDgBmZoVyADAzK9T/BxcHqTavdVMYAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# 绘制颜色分布柱形图\n",
    "\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "color_count = cleaned_datas.groupby('color').count()    # 分组统计\n",
    "numbers = color_count['productColor']\n",
    "labels = numbers.index\n",
    "position = range(len(labels))\n",
    "\n",
    "plt.bar(x=position, height=numbers.values, width=0.6, tick_label=labels)           # 绘制柱形图\n",
    "plt.xticks(position, labels, fontproperties=font)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array(['75C', '80B', '80C', '75B', '70C', '85B', '70B', '85C', '75C/34C',\n",
       "       '80B/36B', '85C/38C', '85A/38A', '85B/38B', '80A/36A', '70A/32A',\n",
       "       '80C/36C', '75B/34B', '75A/34A', '70B/32B', '70C/32C', 'B80',\n",
       "       'B75', 'C80', '170/82/XL', 'C75', '160/70/M', 'B70', '165/76/L',\n",
       "       'C70', nan, '90C/40C', '90B/40B', '85D/38D', '85B+(内裤)套装',\n",
       "       '85E/38E', '80D/36D', '90D/40D', '80E/36E', '75E/34E', '90E/40E',\n",
       "       '75D/34D', '95C', '95E', '85E+(内裤)套装', '95D', '75B+(内裤)套装',\n",
       "       '75B=34B', '80B=36B', '80C=36C', '90D=40D', '85B=38B', '80A=36A',\n",
       "       '85C=38C', '90B=40B', '75A=34A', '90C=40C', '85A=38A', '75C=34C',\n",
       "       '85/38C', '75B/34', '85B/38', '80B/36', '70B/32', 'A75', 'A80',\n",
       "       'A70', '75A', '80A', '70A', '85A', '70A=32A', '70B=32B', 'A85',\n",
       "       'C85', 'B85', '90C', '40/90A=XL码', '34/75D=L码', '32/70B=S码',\n",
       "       '36/80B=L码', '38/85A=L码', '38/85C=XL码', '36/80C=L码', '38/85B=XL码',\n",
       "       '38/85D=XL码', '34/75B=M码', '34/75C=M码', '34/75A=S码', '40/90C=XL码',\n",
       "       '36/80A=M码', '75B=34B ', '34/75AB中厚2CM', '75B=34AB', '80B=36AB',\n",
       "       '75B  ', '38/85AB中厚2CM', '34/75C薄款0.8CM', '80B ', '85B=38AB  ',\n",
       "       '85B=38AB', '70B=32AB', '80A=36A厚杯', '70A=32A厚杯', '75A=34A厚杯',\n",
       "       '75B=34B（粉色预发货17号）', '75B=34B粉色预计4天发', '75B=34B（粉色预发货20号）',\n",
       "       '75B=34B（粉色预发货26号）', '34B/75B', '34/75B', '40C/90C', '32B/70B',\n",
       "       '34A/75A', '36C/80C', '34C/75C', '36B/80B', '34B=75B', '36A/80A',\n",
       "       '32A/70A', '38B/85B', '38A/85A'], dtype=object)"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 特征 productSize 的唯一值\n",
    "\n",
    "datas['productSize'].str.upper().unique()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>creationTime</th>\n",
       "      <th>productColor</th>\n",
       "      <th>productSize</th>\n",
       "      <th>size</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>2016-06-08 17:17:00</td>\n",
       "      <td>22咖啡色</td>\n",
       "      <td>75C</td>\n",
       "      <td>C</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2017-04-07 19:34:25</td>\n",
       "      <td>22咖啡色</td>\n",
       "      <td>80B</td>\n",
       "      <td>B</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>2016-06-18 19:44:56</td>\n",
       "      <td>02粉色</td>\n",
       "      <td>80C</td>\n",
       "      <td>C</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>2017-08-03 20:39:18</td>\n",
       "      <td>22咖啡色</td>\n",
       "      <td>80B</td>\n",
       "      <td>B</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>2016-07-06 14:02:08</td>\n",
       "      <td>22咖啡色</td>\n",
       "      <td>75B</td>\n",
       "      <td>B</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          creationTime productColor productSize size\n",
       "0  2016-06-08 17:17:00        22咖啡色         75C    C\n",
       "1  2017-04-07 19:34:25        22咖啡色         80B    B\n",
       "2  2016-06-18 19:44:56         02粉色         80C    C\n",
       "3  2017-08-03 20:39:18        22咖啡色         80B    B\n",
       "4  2016-07-06 14:02:08        22咖啡色         75B    B"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# 简单的数据清洗\n",
    "\n",
    "size_1 = datas['productSize'].str.upper().str.findall('[a-zA-Z]').str[0]\n",
    "size_2 = size_1.str.replace('M', 'B')\n",
    "size_3 = size_2.str.replace('L', 'C')\n",
    "size_4 = size_3.str.replace('XC', 'C')\n",
    "size_5 = size_4.str.replace('AB', 'B')\n",
    "size_6 = size_5.str.replace('X', 'D')\n",
    "datas['size'] = size_6\n",
    "datas.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[None]"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAQgAAADuCAYAAADFnJnUAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMi4zLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvIxREBQAAIABJREFUeJzt3Xl8VNX9//HXncnKdlkCYTfsRhlQENy1uCtVv4prrU5rW2tt6/7VqF1u+7MtrXWp1n6r1iVa61J3jLu4tO67VwUJW4CwB7gBsk1mzu+PEySZJGQmzL13ZvJ5Ph55ALlz535Q8p5zzz2LoZRCCCE6EvC7ACFE+pKAEEJ0SgJCCNEpCQghRKckIIQQnZKAEEJ0SgJCCNEpCQghRKckIIQQnZKAEEJ0SgJCCNGpHL8LEC6wTAMYBAwACoFecb/u+H0AqAO2t/pq/edaLGeL1+WL9GHIZK0MZZl7ABOBMa2+RgEjgGFAXoqutBWo6uRrGZazLkXXEWlIAiITWOY4YBowveVrGjDQ15p22gB8CnzS8vU+lrPU35JEqkhApBvLzAUOAY4G9keHQX9fa0reBuA94F3gFeADLCfmb0miOyQg0oFllgDHA8cBRwB9fK0n9TYALwLPAy9gOZt8rkckSALCD5YZAGYBJ6JDYZK/BXkqBrwPPAdUYDkf+1yP2AUJCC9ZZikQBr6L7kwUsBAoB+7Hclb7XYxoSwLCbZY5CDgbOA+Y4XM16SyK7q+4D3gKy2nwtxwBEhDu0OMQjgUuAGaTukeOPcUW4GHgH1jOR34X05NJQKSSZRagbx8uA/byuZpsMR+Yi+W87HchPZEERCpYpgn8DLgYGOJzNdnqI+CPwOPyyNQ7EhC7wzIHA5cCPwVMn6vpKSqBG4ByLKfJ72KynQREd1hmL+BK4Cqgt8/V9FSrgeuBu7CcZr+LyVYSEMnQnY/nAr8DRvpcjdAWAldjOc/4XUg2koBIlGUeBtyEngsh0s/rwCVYzud+F5JNJCC6YpnjgT8Bp/hdiuhSFPg78EssZ7PfxWQDCYjOWGYOcA3wC2QcQ6apAS7Hcu73u5BMJwHREcvcGz38V24nMtvTwI9lzYruk4BozTKDwP8CFpDvbzEiRTYCP8FyHvO7kEwkAbGDZe6Jngewv8+VCHc8BPxU+iaSIwGhH11ehn50WeBzNcJda4AfYjnP+V1IpujZAWGZfdF9DfKEome5ET12Iup3Iemu5waEZU4CngRK/S5F+OIV4ExZ3WrXeua+GJZ5MnpVIwmHnuso4EMsc4rfhaSzntWC0Eu9/Qa4DjB8rkakhzrg+1jOo34Xko56TkDoKdkPohdwESLen4BrZCp5Wz0jICxzOPACEPK7FJHW5gFnyHJ3O2V/QOjOyBeBPfwuRWSE14CTsJxtfheSDrI7ICxzP/ReDEV+lyIyynvA8TKoKpufYljmIcCrSDiI5O0PvIFlDvW7EL9lZ0BY5tHo24p+fpciMlYI+E/LJsk9VvYFhGUej+5s6uV3KSLjjUeHRE/a+ayN7OqDsMyDgJeRcBCpVQ0cjOVU+V2I17InIPQaDv8BBvhdishKi4FDetraEtlxi6HvE19EwkG4ZzzwIpbZ3+9CvJT5AaH3pngJ2QxXuG8q8BSW2WMWE8rsgNDTtZ8HJvpdiugxDgfKW9YRyXqZGxB6UdknkHUjhffORM/dyHqZGxD6f9BRfhcheqwrscyw30W4LTOfYljm6YBMzxV+qwcOyObNejIvICyzFL3YSx+/SxEC/fhzPyzH8bsQN2TWLYbulHwCCQeRPsYD9/pdhFsyKyDgHmBPv4sQIs4pWOaVfhfhhsy5xbDMK4A/+12GEJ1oBo7Ect70u5BUyoyAsMxp6Dn6OX6XIsQurAWmYDkb/C4kVdL/FsMyc9H3eBIOIt0NBW7zu4hUSv+A0CtQy9LkIlOc2bKtQlZI71sMvWfBh0Cu
36UIkYTVwF7Z8OgzfVsQeij1vUg4iMwzHL29X8ZL34CAq4BpfhchRDf9AMvM+KkA6XmLYZl7AR8DPWZarchKy4AQlrPd70K6K11bELcj4SAy3xjgt34XsTvSrwVhmacCj/tdhhAp0gSUYjlL/S6kO9KrBWGZefSQefaix8gDfud3Ed2VXgEBFwPj/C5CiBQ7s2WXt4yTPgGhFwO91u8yhHCBAfzR7yK6I30CAq5GVqUW2euIlk2dMkp6dFJa5nD0whuFfpcihItsYB8sJ+Z3IYlKlxZEGRIOIvuFgHP9LiIZ/rcgLHMgsBLZLk/0DIvQjz0zohWRDi2InyDhIHqOiUDGzPb0NyD0DkU/87UGIbx3ld8FJMrvFsQ56EU2hOhJBmKZRX4XkQi/A+Jyn68vhJdeA04C9sRyNvpdTCL8W8ZNPxPe27frC+GNCPAwcBOW82lXLy4pqzCAE4DzgXOWz53d4HJ9u+TnOo9X+HhtIdy2CbgD+CuWs7qrF5eUVRQC5wGXsnNrh7OA+9wqMBH+BIRljgGO9OXaQrirErgFuA/LqevqxSVlFcXojvoLgfh+iZ/TIwNCd04KkU3eAG4Cnk1kjENJWUUI3Qd3Np2vfTKtpKziwOVzZ7+TujKTIwEhRPdF0JtI34TlfNzVi1v6F45DB0Oiy9H9HPAtILwfSamnvX7g7UWFSKnNwJ3AbVhOdVcvLimrKEAPsb4U2CvJa0WAIcvnzt6SdJUp4EcL4rs+XFOIVFgM/AW4N5F1Jlv6F36K7l8Y3M1r5gInAg908/zd4m1AWGYQ3TMrRCb5D7p/4ZkE+xcmo28jvkNq1ladg08B4e0thmUeC7zg3QWF6LZm4N/o/oUPEzmhpKxiR//C0SmupQEYvHzu7G0pft8ueX2L8R2PrydEsrYAdwG3YjmrunpxSVlFPjv7F9wa+FcAzAYecen9O+VdQFjmjh5cIdLRUnT/wj1YTpef1CVlFYPR/Qs/AYa4XBvo2wzPA8K7WwzL3Be9GY4Q6eQtdP/CUwn2L+yFvo04B/3J7pXt6NuMeg+v6ektRqrvy4TormbgMeBmLOf9RE4oKas4Bh0Mx6AXofVab+BY4CkvL+plQBzj4bWE6IjDzv6FlV29uKV/4RzgMmCyy7UlYg4eB4Q3txiWWYgeXCLb6Qk/LEP3L9ydYP9CEXBRy1exy7Ulw0EPmmry6oJetSAOQ8JBeO9tdvYvRLt6cUlZRSm6tXAu3vYvJMoEZgEvJnuiYRj/AzwJlCqlFiZ6XpcBYRhGFL1ctwFEgZ8ppd5Osj65vRBeiaL3dr0Jy3kvkRNKyiqOQvcvHIc//QvJOJxuBAR6Uth/W379daInJdKCqFdK7QNgGMaxwB9aikyGTO0WbqsF/oHuX6jq6sUlZRV56HE5lwFTXK4tlQ5O9gTDMPoAh6BbH/NIcUC01g/dl5A4y+xNenTwiOy0HLgV+AeWs7WrF5eUVQxCj134KZm5HuqMkrKK3OVzZ0eSOOdk4AWl1CLDMGoMw5iulPookRMTCYhCwzA+Rd+TDQOOSKIwgH2BYJLnCNGVd9H9C08k2L+wJ3q043lk9iZNhcA0IKHbpxZnoztpQS9/dzaQsoBofYtxIHC/YRiTVeKPPzJyV2ORlqLojrabsJyE1kgoKas4Et2/cDzp37+QqINJMCAMwxiI/lAPGYah0B/WyjCM/03kZzipWwyl1DuGYRShp66uT/C0aclcQ4gObAXuBv6C5Szv6sUt/Qtno/sXprpbmi9mJvHa04AHlFI/3vENwzDeAA4F3uzq5KQCwjCMPdEJVJPEadn4P0h4YwW6f+EuLKe2qxeXlFUMZGf/wjCXa/NTMh+6ZwN/jPve4y3f7zIguhwo1eoxJ+gm2rVKqYqESrPMHPQY8ryEXi+E9j66f+GxBPsXJqJbC+fRM7ZxVIC5fO7sLjtld1eXLQil1O50ME5EwkEkJoYeRnwTlvNWIieUlFXMQvcvzCZ7+hcSYQD7oBey
cZXbIyk9fbx58zuN/OOTCAYQKg5w78mFrNmqOOvxOmrqFNOHB3nglELygm3/LT34eYQb3m785s+fr4vx8Y97U1oU4OSH61hVq7hoRh4XzdBZd8G8ei7cL49pw+ThTApsA+5B9y8s7erFJWUVuehVyS5DPyFLSKxhGzXP30rTxhUAFJ1wCfkjSr857rz3ONu/er3lxVEiNasY+fMHQcXY8MTviDVuo/+h59Jr4oEArH/8/zHwmIvI6Tso0RJSbRpZEBAlLr//N6prY9z6fhNfXdSHwlyDM/5dx8NfRHiuspnLDsjnrMm5XPhsPXd/HOEnM9o2as6Zkss5U3IBsNdF+Z9H6thnaJBnvo5wyOgcrj00j4PvqeOiGXl8tjZKNIaEw+5bCdwG3InlOF29uKSsYgB6bcefAcOTvdimV++kYOx0Bp9yLSoaQUUa2xw395+Duf8cAOoWv0ftB08TLOxL7YfP0Gff4+k18UDW/9ui18QDqVv8HnnFY/0MB/BoVzq3A2Kky+/fRnMM6pshN6ioi8CwPgHmL4vyrzn6sXd4ai7WG43tAqK1h76IcNbeOixyA1AXUUSisKOr5pevNfL3b6fjMP2M8QE7+xeau3pxSVnFBPT4hTB6ynPSYo3baVj5JYNOuAwAI5iLEczt9PXbv3qT3qWHtbw2BxVpREWbMQIBVCzK1g+fZvCcX3WnlFTy5GfL7YAY4fL777xQvwBXHpjH6Ju3UphrcMy4INOHB+hfADkBfUsxsl+A6tpdd8o+8mWEp8/S/VxHj8vhgc8jHHD3dv73oHye+TrCtGEBhvf1e8/jjBMDnkb3L/w3kRNKyioOR/cvfJvd3GS6ecs6gr36UfPcLTStX0b+0PEMOPICAnntgz4WaaBh2UcMPPpCAHrvdTgbn7mBbZ+9QP/Dv8fWjyvovfcRBHJ9/5Dw5Gcra1oQm+sVT3/dzLJL+tC/wOD0f9fzwuIuP6DaeG9VM71yDSYP0bcPOQGDf83RYRGJKo79Zx1Pn9WLy19sYIUT47ypuZw0qfNPIsE24F7gliT6F85A9y9MT1URKhalae0SBh51IfnDJ7HplTuoffff9D/s3HavrV/8PvkjSgkW9gUgkN+bIadbAEQbtlH77mMMPvU6ap6/lVjDNvrNPKVNX4aHkr7N6g63Pwo9a0G8srSZMf0DDO4dIDdocGppDm+tiLKlAZpjutWwqjbGiH6dd3Y//EUzZ0/u+Af+bx80cd7UXN5dFcXMN3jktEJufMezafmZZhVwNTAKy7m4q3AoKavoX1JWcTV6Xch/ksJwAMjpW0SwbxH5wycB0GvSwTStW9Lha7cveJPee3U8F9F56yHMg85g+1dvkD9ybwbNvpwt//1XKktNRlHLgjauci8g9B4Ynk2GGW0avFsdpS6iUErx6rIoew0OMmtMkMe+0i2J8s8inNzJJ35MKR79KsJZHQTE5nrFs5XNnDc1l7qIImCAYUB9xONdydLfR+gVmMZgOX/Ccna5G1RJWcW4krKK29CBMheXWpzBPgPI6VdEpEYvUt1Q9Rm5RaPbvS7WuJ3GlV9QOP6Adscim6qJbq2hYPQUVHOj/gdggGr29UPC9VaEm7cYQ/Fwktb+I3M4rTSHaXdsJycA+w4LcsH0XGZPzOGsx+r4xfwG9h0W5Af76gB45usIH66O8ttZ+l7yzaooo/oFGDugfWb+9o1Grjs0n4BhcOz4HG7/oI7Q/0W4cLoM8UD3L8xD9y90OTIPoKSs4lB0/8JJuN+KBWDgURey8dk/o6LN5PQfyqATLmXrJ88B0HffEwCoW/QOBSX7dtg3seXNB765Jeldejgbnrie2ncfwzzU121mR6BXy3KNe0vOWeZMkptxJjLLdvTW9LdgOYu7enFJWUUOO/sXZAJfapy1fO5sV5fCd7MFMcDF9xb+qQb+CtyB5XS5NkhJWUV/4AL0+IVRLtfW07jex+dmQMgalNnlY+Bm4BEsp8vFSkrKKsaixy98H+jj
cm09lQSE8JUCnkX3L7yeyAklZRWHoPsXTsaj/oUeTAJC+KIOKEdvLFPZ1Ytb+hdOQ/cvJLNWgdg9A92+gASEaG01O/sXNnX14pKyChP4EfBzoP1zQ+E210fpSUAIgE/R8yMeTrB/YQxwCXA+0Nfl2lKmfulHbHr1TojF6DP1GMwDTm9zvPb9J9n2+UsQCBLs1Y9Bx19KjjmESM0qNs67ARWLMujYn5I/ohQVi7L+0V8xeM4v/Rx2neyCT63XdgF4WCk1N2UXSJIERHpTQAW6f+G1RE4oKas4CN2/8D9k2ELEKhZl08v/x5Azryen7yDWlF9G4fj9yWs1YCqveBxDwzcTyC1g6yfPsfn1exl88tVs/fR5Bhx1ATn9itn86p0MPqWUrZ88R++9Z/k9JyPZFsQ368smSgKi5zKAl7oKh5KyiiB6T8jLgPZDDDNE05pF5PQfRm5/Pbi3d+lh1Fe+2yYgCvbYuT1G/vBJbP9S/6f5ZkZncyMEgsQatlG/+H2GnPEbb/8S7bm+M56bF+hJK/xkqluxzCiW87eODs767Wn7HL6p6OrR6/KKDYL1MSPwWiyYo5SRQ9QIoowgsUCQWCColBEkZgRRgRyUEdC/N4KoQMuvRmDHr4YyAgojaCgjAAT0r/r7GBiGMoIYhpHSfz+LVlcXVQd7D5xVn7sI4Ct6D9m4bmG/w+pzOxzk9eY7T40fN7i0aWZ97gpn4gn5b744d89INGJ864hLKhe9dE/xQZNPrhndkN/lOhZuakZVJ3nKji0sdviDUmqXA63cDAjX18sTKfFXLDOG5fw9/sDGPb7+fOMeXzd92aRmTq5Si2cuUlv2rlIFRbWMCSiGuFmUAqWMQFQZgWZlBKOKQLMKBGL612BMH2v9azAaM4KxHcdiRlApIxhVOrxiFeuWF7zvrA5+9/NHCmJGMPbUmq9yFm7dEDhjQUVQ6deqmA4zNa/qoyHNq7/s+4vDLvwkr+qjmDICDWcdfMEnyggayzevKVi1aUX+yWMP237fY1dOao7FAqfvc1rV6IElDegQDCjDQBEI6FAMBDAChsIIKCNgYBhBfcwwwAgowwiAEdz5ewJg5CiDlu/rP6Nv6YLon9kghhEEPkvyP2vStxhuDrUOo4fiivSngB9jOXfFHwiVhwLA/ehJWN8o3qyqp1eqFfstVk1j1qpBvRqZYKTxbeWn9fXcvnEjd43SgznvrNELs18wqO2qUG9v387v16+jfNRoBuW0//y8fHU1FxcN5inH4eDevRmem8stGzZww3BPZl+3oTDe22vhVwnf9hmGsU0pldSgNWlBCNC3g3e0tCTubn3ADtuxUHkojB70dPaO768bYIx4bqYx4rmWUQ85zappz1Xqy5lfq5opy1Vu8RZGB2PeTffvyuSCAqoiTaxqamJIbi7Pb63lT8Pa/lB/1dDAb9at5Y6RozoMhw/q6hiSk0NJXh4NKoaB/o/SoGLe/CXiGKhktt/rFgkIsYMB3NXSJ3Ff6wN22I6GykPnon8ezuzo5OYcI++LEmPvL0p2fm/AVrV+2mK1fEalqh+/WvXvW894o5vLxu2uqFIEgdnLl6GUYlphIRPy87lt4wb2LijgiD59uWJ1NasjEU5ZvowgMKWwkHtGjWZZUyNXVldTFYlwy3Cdeaf2M/nOiiqG5+ZiFfu2xWdDkq+P74N4QSlVtqsT3LzF2B+9f6LILDHg+1jO/fEHQuWhHOAh9KjJpAViKjpuDUtmLoqt3WeJCg7fxPCcKCWGBx3aSinqlKJ3IEBEKb67ooprhxQztXDnNp3v1W1nSkEhhYEAD2/ezPv1ddw0fAR/XL+Oo/r0ZURuLn9Yv46/jBjJPzdvoncgwClmf7dL35VnSxcuONHNC0gLQsQLAPe2tCQebH3ADtvNofLQ2S2vOTXZN44FjGDlCCZWjghOfHCW/l6fOrV5n6VqycxFavukVapP/+2MN8BMwd+jDcMw6N3yYKRZKZo7+GDcv9fOxs2UwkLm
1erNvHIMgwalaFCKHMOgNhrl9W3buHOk75NTG7t+ye5xswUxCr11mshMUeBcLOeh+AOh8lAu8Ch6wFRqKaX2WM+yGZVq9b6LY2r0BorzmhlvpGDiV1QpTqtazoqmJr4zYABXDO78Qcz169ZSlJPDhYOKWB2JcM2a1TQphVU8lKdqHWb16cPMXr7cLbV2X+nCBd938wJuBkRfoMv9FEVaiwLfwXIejT/QEhKPA642cQEKmtS2KcvU4hmLlLPXClU4aCtjAorB3X2/2miUi6urua64mAn57R+8POM4/GvLZu4fNZq8QNtcqmpq4i8bN3DtkGJu2LCeiFJcXDSYkjxfVhebW7pwwTVuXsC9gACwTAfo594FhAeagbOxnMfiD4TKQ3nAE+it7zw1rEat3K9SrZy+ONY8Zh1FBU1MMJIYevy3jRspCBicPzBzH3MCF5cuXHCbmxdwe6jmSjzaAUi4RndM6j6JJ1sfsMN2U6g8NAd4Ejjey6LWDDJGzRtkjJp3gP6Ez21WDXutUAtnfq02Ta5S+UO2sEdQ7dzhe1NzMzmGQb9gkIZYjLfrtvPDuHDItMecwJpkXtxqslYuOvjvB25WqvO/gNstiOfw+B+OcE0EOB3LeTr+QKg8VIDeGOcYz6vahUGOWjN9sarar1I1Ni6pGzZ3xdoJSmHEUBzXtx8XFRW1ecx5/soVVDY2UtQSDsNzcrl9pF5oWynFD1et5MbhI+gfDLKksZGr1qwmquBXxcVM6+XLpuKHlC5ckNBGx9B2oJRhGEOAfwFvKaV+3ek5LgfE34CfuHcB4bEIMAfLmRd/oCUk5gFHeV5VggIx1TyxmsqZX8c2TF2mcoZtYmROLKPXsRhbunBBwqtax4+kNAxjLHorxCLVSRC4HRBXAH927wLCB03AqVhORfyBUHmoEL1E3RGeV9VN/barmn2XqKUzFqm6SdWqX786xhuZscZFFCgoXbgg4e3jOhpqbRjGFmCSUmpdh+e4HBAnAs+4dwHhk0bgFCzn+fgDofJQL/Q6E9/yuqhUMJSKjVnL0hmLYmv3XarUyA0Mz40y1ovBXElaXrpwwZhkTkjHgNgTWODeBYSPGoGTsJyX4g+0hMTzwGGeV+WCXg3KmbpMLZmxSG0tXal6DdjKuIAH60F2YX7pwgVHJnNCOt5i5KE3WHF9YQvhiwbgRCznlfgDofJQb+AF4BDPq/LAyI2qar9FatW0xbFoyXoG50eYYHj77/wfpQsX/CiZE+I6KQcDDwLv+NdJCWCZnwJT3b2I8FE98G0sZ378gVB5qA/wInCQ51V5LC+i6iZXqcUzv1ab965SBYNrKQkoil285DWlCxfscj3JeB085nwAuMm/x5wAlnkHemclkb3qgNkd7Z0RKg/1BV4ig5er664hW9Tq6ZWqar9K1TQ29WtmzC5duOC5FL1Xp7wIiPOBu7t8nch024ETOtrAN1Qe6ge8TA/fMyMnqpomrVKVM79WNVOWqZziLYzOiXV7R/Pi0oUL1qe0wA54ERCTabvUtshe24DjsZz/xh8IlYdM4BVk4942+m9TG6YtVstmVKr6CdXK7FvPhATWzFhVunCBJ1NJvQiIALCFzHi2LHbfVuA4LOft+AOh8tAAdEhM87yqDBGIqehYvWbGun2WKmNEDSM6WDPjqdKFC07xoh73AwLAMucDs9y/kEgTtcCxWE67BYNC5aGBwKtAUoun9mR96tWWqXrNjG17rlR98iM8Mv3zBTd4cW2vAmIucLX7FxJppBY4Gst5P/5AqDw0CJgPTGl3lkjE0XbYbvdo2Q1e7b4sS8/1PP2Al7DMdn0OdtiuAY5E+qa6I4qHP09eBcRr6Oeuomcx0SHRrs/BDtsb0SHxpedVZbbP7LC9zauLeRMQluMA7Xq2RY8wAHgZy2zX52CH7Q3oiV1feV5V5vqPlxfzqgUBepaf6JkGAq9gme36HOywvR4dEgt39yIbX9pI5XWVVF5bycYXN7Y7vuXtLVT+opLK
X1Sy5Pol1K+oB6C5tpmlv1tK5XWV1H60c5XEqr9UEdns+tYTyWo3Qc5NXgZEuzUERI8yCHgVywzFH7DD9jp0SCzq7ps3rGpg8xubGfercYz/f+PZ+tlWGte1XfQ5b3AeY68Zy4TrJzDkpCGsvm81AM57DgNnDWTcr8ax8SUdLLWf1FIwuoDcAcluoO2qbcDrXl7Qu4CwnEVApWfXE+moCB0S7ZYhtMP2GvSj8G79G2lc3Ujh2EIC+QGMoEHvSb3btAYAek3oRbB3UP9+XC8im1paB0GINcVQzQojYKCiipqXahh8QrfXxXXLK3bYdn2p+9a8bEGA3GYIGAzMxzJL4w/YYXs1OiSWJPum+SPzqVtUR/O2ZmKNMbZ+vpVITee3B5vf3EzfKXrsXv8D+lP7cS3LbljG4BMHs2n+Jvof1J9Avtc/Hl1qt0iP2yQghB+GoENiUvwBO2xXo0NiaTJvWDC8gKITilh+w3KW37icwtGFGIGO13jZtmAbm9/cTPEZerJlsFeQkstLGG+Np3CPQmo/raXfjH5U31PNir+uoG5xXdJ/QRcofAgIbwZK7WCZucB6wNf9ykTaWAN8q+X2s41QeWg08AZQ0p03XvvYWnIH5DLoyLYrVzesbKDq1ipKrighf2j7iZVrHlpD33360rSuCSPHwJxhsuK2FZRc2a0yUukjO2x7Po/F2xaE5USAhz29pkhnw4DXsMwJ8QfssL0C3ZKoSvTNmmv1UJummiZqP6yl/wFtP4eaappYcdsKRl0wqsNwaFzbSGRThD6lfYg1xb6Z/RBr8m1Z+9Z8aX1724IAWkbWfeDtRUWaqwYOx3La9T2EykNj0C2JLmcvLv39UqLbohhBg6FnD6XPXn3YNH8TAAOPGEj1PdU4HzrkDWrZBSsI463x35y/4vYVFM8pJn9oPs21zVTdWkWsLsaQU4Zgzkj5dqHJmmSH7W4/5eku7wMCwDI/Q8bhi7ZWom832vU9hMpD49CP97q7dkKme8cO276syuVXN60sICPijULfbpTEH7DD9hL07cZqr4tKE/f5dWG/AuKfeLB1ucg4o9EhsUf8ATtsL0aHRFLbzWWBenzst/MnICxnE3qrNiHilaBDol2fQ8s9+CxgrddF+ehJO2z0X7ttAAAI+0lEQVTXdv0yd/g5EkRuM0RnxqBDol2fgx22v0YPy+5wo5csdJ+fF/czIF4hycEwokcZhw6J4fEH7LC9AD1VfIPnVXlrJXr1Ld/4FxCWEwNu9O36IhOMR4fEsPgDdtj+Et2SaD9tM3v81Q7bvg7C8Huw+b1k/6eA2D0T0cOyh8YfsMP2F+iWRI3nVbmvFvi730X4GxCWUw/c6msNIhPsiQ6JIfEH7LD9OXAUsMnzqtx1h5+dkzv43YIAuB2dlkLsSik6JNrNwbbD9qfA0cBmz6tyRxNwi99FQDoEhOVsBm7zuwyREfZGh0RR/AE7bH8MHIPegyXTPdgy9d13/geEdhN6wxUhujIZvejMoPgDdtj+EDgWcDyvKnUU4MmeF4lIj4DQA6ekFSESNQW9xuXA+AN22H4fOI7M/cB5puUxblpIj4DQ/oReK0KIROyDXi17QPwBO2y/S2aGRAz4pd9FtJY+AaGXxr/O7zJERpmG3nej3QJEdth+GzgevdBrpnjQDttptZlQ+gSEdg/wkd9FiIyyH/AiltluwQY7bL8FnABs97yq5DUBv/K7iHjpFRB6dOXFfpchMs5M4AUss90O8nbY/g8wG0iLhSV34VY7bC/3u4h46RUQQMu28Q/6XYbIOAegQ6JP/AE7bL8BnIieOp2ONgDXJ3OCYRhDDcN42DCMJYZhfGQYxnOGYUxMdWHpFxDa1WRGs1Ckl4OA5zsJifnASUCD51V17dd22E740axhGAbwJPC6UmqcUmo6cA1QnOrC0jMgLKca+L3fZYiMdAhQgWX2jj9gh+1XgJNJr5D4BLgzyXNmARGl1DdzNZRSnymlUr5vZ3oGhPZn4HO/
ixAZ6TDgWSyzV/wBO2y/BJxCeqxo1gz8wA7b0STPm4xHnfnpGxCW0wR8l/T4Hykyz7eAeVhmYfwBO2y/AMxBPznw05/tsP2JzzXsUvoGBIDl2KTZwBGRUY4AnsEyC+IP2GG7AjgN/0Lia+A33Tz3S2B6CmvpVHoHhHYj8KbfRYiMdRTwdCchMQ84A+h8E093KOCHdtjubl/IfCDfMIwLdnzDMIwphmEcmpLqWkn/gNBjI85DpoSL7jsGeBLLbLedlh22nwbORPcHeOX/7LD93+6erPRmNqcAR7U85vwS+AMuLObrz8Y53WGZ30OvQCVEd1UAp7b0b7URKg/NQS8vn+NyDcuAqXbYzoh5IunfgtjBcu4DnvC7DJHRZgOPYZl58QfssP04cA6Q7BOFZDQCp2dKOEAmBYR2PrDQ7yJERjsReLRlp/k27LD9KPrJmVshcYUdtjNqrlHm3GLsYJkTgfeAdjP4hEjCE8CZWE67vodQeegc4H5S+wH6qB22z0zh+3ki01oQYDmLgLNwtykost+pwENYZrs+BztsPwh8D70+QypUAj9M0Xt5KvMCAsByXkTP1xBid5wGPIhlBuMP2GH7AeAH7H5INJBh/Q6tZWZAAFjOjehmoBC74wzggU5C4j7gR+hxC911kR22P9uN832VuQGhXYDujxBid5wNlGOZ7X4e7LB9D/BjuhcSv7PDdkY/ms+8Tsp4esel/6L3chRidzwAfK9lcF4bofLQhcDfACPR97LD9nmpLM4Pmd6CAMtZi940JS32ERAZ7Vzg7k5aEn8Hfpbg+8xH919kvMxvQexgmXuj52y0WwpdiCTdA/wQy2n3wxEqD/2cXW8X+SVwcDILwKSzzG9B7GA5X5L5m6aI9HA+cAeW2e52wg7btwGXdnLeauD4bAkHyKaAALCcHTsrycQusbt+BPytk5D4C3BF3Lc3AsfZYXulF8V5JbsCAsBy3iMzN00R6ecIoN3GPAB22L4JuKrljzXAkem2p0UqZE8fRDzLPAB4Fmi3h6MQCfgCOArLWberF4XKQ5cCb6T7ylDdlb0BATvmbbwAjPG7FJFR9E7hllPjdyF+y75bjNb0vI0D0f/DhUjEi8AsCQctuwMCaGkiHo7+Hy/ErvwVmI3lSCd3i+y+xWhNz9r7BxD2uxSRdqLAJVjO7X4Xkm56TkDsYJm/BX5B4kNmRXZz0OtCSAuzAz0vIAAs80SgnE4eYYkeYynwbSxngd+FpKvs74PoiOXMA/YF3ve7FOGbl4D9JRx2rWcGBIDlVAGHsutx9SL7NAGXA8dhORv9Libd9cxbjHiWOQe4GzD9LkW4agHwHSznU78LyRQ9twXRmuU8jt7K7EO/SxGu+TswXcIhOdKCaE0vO3Yp8Fug3c7QIiPVAD/Acp72u5BMJAHREcscg/7EOcbvUsRueQS4DMtZ43chmUoCYlcs81zgZmTCV6ZZAPwMy5nvdyGZTvogdsVyHgD2BP7pdykiIdvR2yFMlXBIDWlBJMoyZwE3oDszRfr5N3A5lrPK70KyiQREMvTqQmcDvwf28LkaoX0KXIXlvOx3IdlIAqI7LDMfvVfCtUCxz9X0VF8BvwYe72hxWZEaEhC7wzJ7oZdCvwrpyPTKAuB3wEMd7V8hUksCIhUssw96GvnFwESfq8lWn6CD4QlpMXhHAiKVdB/FccAl6DEUMqV89zQDTwN3SB+DPyQg3GKZewI/R7csevtcTaZZDtwF3NOyc5rwiQSE2yyzP/Bd4CzgIKRV0ZkoMA+4A3hJ+hfSgwSElyxzFHBmy9d+PleTDqLAW8BTwKNYTrXP9Yg4EhB+scxx6FbFWcBkn6vxUgPwMjoU5mE5G3yuR+yCBEQ6sMw90Ls4HQHMAkb4W1DKrUbveP0U8AKWs93nekSCJCDSkd7wZxY6ML4FDPG1nuREARt4G3378FbL6l0iA0lApDv96HQssE/L11QghB7q7XeHZzOwGD146TN0ILyH5ci+qFlCAiJT
6VGck4BSYDwwFD3se2ir3+/u49VaYHPL12r0KtA7vhYBi7GcyG5eQ6QxCYhspkd4DgWKgFwg2OorJ+7PW9kZBpuBLVhO1IeqRRqRgBBCdEoWjBFCdEoCQgjRKQkIIUSnJCCEEJ2SgBBCdEoCQgjRKQkIIUSnJCCEEJ2SgBBCdEoCQgjRKQkIIUSnJCCEEJ2SgBBCdEoCQgjRKQkIIUSn/j9qAKaW7AupRgAAAABJRU5ErkJggg==\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# 绘制饼图\n",
    "\n",
    "size_count = datas.groupby('size').count()\n",
    "\n",
    "labels = [\"A\", \"B\", \"C\", \"D\", \"E\"]\n",
    "fig, ax = plt.subplots()\n",
    "explode = (0, 0.1, 0, 0, 0)\n",
    "ax.pie(size_count['productColor'], explode=explode, \n",
    "        labels=labels, autopct=\"%1.1f%%\", \n",
    "        radius=1.2, startangle=0)\n",
    "ax.set(aspect='equal')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "### 1997年11月，统计学家吴建福提出将统计学重命名为“数据科学”，同时统计学家应称为“数据科学家”。现在一般认为数据科学（data science）综合了多个领域的理论和技术，包括但不限于统计分析、数据挖掘、机器学习等，其目标是从数据中提取出有价值的部分，应用于相关的数据产品之中。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "py35-paddle1.2.0"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
